Without-Replacement Sampling for Stochastic Gradient Methods: Convergence Results and Application to Distributed Optimization

نویسنده

  • Ohad Shamir
چکیده

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In practice, however, sampling without replacement is very common, easier to implement in many cases, and often performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling, under various scenarios, for three types of algorithms: Any algorithm with online regret guarantees, stochastic gradient descent, and SVRG. A useful application of our SVRG analysis is a nearly-optimal algorithm for regularized least squares in a distributed setting, in terms of both communication complexity and runtime complexity, when the data is randomly partitioned and the condition number can be as large as the data size per machine (up to logarithmic factors). Our proof techniques combine ideas from stochastic optimization, adversarial online learning, and transductive learning theory, and can potentially be applied to other stochastic optimization and learning problems.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Without-Replacement Sampling for Stochastic Gradient Methods

Stochastic gradient methods for machine learning and optimization problems are usually analyzed assuming data points are sampled with replacement. In contrast, sampling without replacement is far less understood, yet in practice it is very common, often easier to implement, and usually performs better. In this paper, we provide competitive convergence guarantees for without-replacement sampling...

متن کامل

Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling withand without-replacement in such algorithms. Focusing on least means squares...

متن کامل

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling withand without-replacement in such algorithms. Focusing on least means squares...

متن کامل

Adaptive Probabilities in Stochastic Optimization Algorithms

Stochastic optimization methods have been extensively studied in recent years. In some classification scenarios such as text document categorization, unbiased methods such as uniform sampling have negative effects on the convergence rate, because of the effects of the potential outlier data points on the estimator. Consequently, it would take more iterations to converge to the optimal value for...

متن کامل

Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms

Stochastic algorithms are efficient approaches to solving machine learning and optimization problems. In this paper, we propose a general framework called Splash for parallelizing stochastic algorithms on multi-node distributed systems. Splash consists of a programming interface and an execution engine. Using the programming interface, the user develops sequential stochastic algorithms without ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1603.00570  شماره 

صفحات  -

تاریخ انتشار 2016